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Abstract. In this paper we present an improved fitting formula for the dark matter bispectrum 
motivated by the previous phenomenological approach of Ref [1]. We use a set of LCDM simulations 
I to calibrate the fitting parameters in the /c-range of 0.03/i/Mpc < k < 0.4/i/Mpc and in the redshift 

range of < z < 1.5. This new proposed fit describes well the B AO-features although it was not 
" ^ I designed to. The deviation between the simulations output and our analytic prediction is typically 

I> less than 5% and in the worst case is never above 10%. We envision that this new analytic fitting 

formula will be very useful in providing reliable predictions for the non-linear dark matter bispectrum 
^> for LCDM models. 
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1 Introduction 

The dark matter and galaxy power spectrum have been widely used to study the growth of structure, 
to constrain cosmological parameters and galaxy bias models. These tools have proved very successful 
and have contributed to crystallize the current LCDM model e.g., [2] and refs therein. With ongoing 
and forthcoming galaxy surveys, like BOSS^ and EUCLID^, the signal-to- noise of the data will increase 
and the uncertainties around this model will be reduced. Higher precision data will allow the use of 
not only the two-point correlation function, but also of higher-order statistics, in order to constrain 
and improve our theories and models. The bispectrum (the three-point correlation function in Fourier 
space) is naturally the next statistic to consider [3, 4]. Using both the power spectrum and bispectrum 
we can improve our knowledge of the growth of structure and galaxy biasing [5-13], constrain possible 
departures from Gaussianity in the initial conditions of the matter density field [14-18] as well as 
constrain departures from GR e.g., [19, 20]. 

From a theoretical point of view, perturbation theory and subsequent improvements such as 
renormalized perturbation theory [21], resummed perturbation theory or time-RG flow [22] is a phys- 
ically well-motivated approach to study these statistical moments. Tree-level perturbation theory has 
demonstrated to describe well the behavior of the power spectrum and bispectrum at large scales. 
However, non-trivial computations are needed to obtain predictions at non-linear scales: the one-loop 
correction and beyond, for the power spectrum and bispectrum. For the power spectrum, how- 
ever, other phenomenological approaches have been demonstrated to work better for a wide range of 
redshifts and different cosmologies e.g., [23-25]. For the bispectrum there are also simple phenomeno- 
logical models that predict its behavior at non- linear scales [1], but they fail to accurately reproduce 
the BAO-features [26] and are only precise at the 20%-30% level. Therefore better analytical models 
are needed to describe the bispectrum at these non-linear scales. 

In this paper we improve the phenomenological description presented by [1] (hereafter SC) more 
than 10 years ago. Using a set of modern simulations we fit the free parameters of our proposed 
analytic formula. Thus, we obtain an improved description for the bispectrum in the LCDM model 
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scenario (including baryonic acoustic oscillations) in a range of 0.03/i/Mpc < k < 0.4ft,/Mpc and for 
different rcdshifts, < z < 1.5. 

This paper is organized as follows: in §2 we begin with a description of the density field statistics 
and different analytic approaches to the dark matter bispectrum. In §3 we describe the simulations we 
use to fit the parameters. In §4 we present our results, compare them with previous fitting formulae 
and with 1-loop corrections and discuss the differences. We finally conclude in §5. In Appendix A, 
we give details of how the bispectrum and its errors are computed from simulations. In Appendix B 
we test how our formula works for other non-standard LCDM models. In Appendix C we present a 
short description of 1-loop correction in Eulcrian perturbation theory. 

2 Theory 

2.1 Power spectrum & bispectrum 

The power spectrum P{k), the Fourier transform of the two-point correlation function, is one of the 
simplest statistics of interest one can extract from the dark matter overdensity field (5(k), 

(5(k)J(k')) = (2^)=^(5^(k + k')F(fc) , (2.1) 

where denotes the Dirac delta function and (...) the ensemble average over different realizations 
of the Universe. Under the assumption of an isotropic Universe, the power spectrum does not depend 
on the direction of the k-vector. Since we only have one observable Universe, the average (...) is 
taken over all different directions for each k-vector. Under the hypothesis of ergodicity both averages 
will yield the same result. 

The second statistic of interest is the bispectrum B, defined by, 

(^(ki)<5(k2)5(k3)) = (2^)35^(ki + k2 + k3)B(ki, k2, kg). (2.2) 

The Dirac delta function ensures that the bispectrum is defined only for k-vector configurations that 
form closed triangles: k^ = 0. Note that 6(^.1)6(^.2)5(^3) is in general a complex number, however 
once the average is taken, the imaginary part goes to zero. 

It is convenient to define the reduced bispectrum Q123 = (5(ki,k2,k3) as, 

o B(k,,kM 

- P(k,)P(k2) + P(ki)P(k3) + P(k2)P(k3) ^ ■ ' 

which takes away part of the dependence on scale and cosmology^. The reduced bispectrum is useful 
when comparing different models, since it only has a weak dependence on cosmology and one can thus 
break degeneracies between cosmological parameters in order to isolate the effects of gravity. 

The bispectrum for Gaussian initial conditions is zero and remains zero in linear theory, i.e. as 
long as the fc- modes evolve independently'*. However, when non-linearities start to play an important 
role, mode coupling is no longer negligible and the bispectrum becomes non-zero. Thus, by measuring 
the bispectrum one can extract information about how non-linear processes influence the evolution of 
dark matter clustering. 

2.2 Analytic approaches in the literature 

In order to understand the observational data, we need accurate theoretical predictions for B(ki , k2, fca). 
A physically well-motivated analytic theory for doing this, is perturbation theory (PT hereafter) (see 
[27] for a review) or subsequent improvement such as renormalized PT, resummed PT etc. 

In an Einstein de-Sitter Universe (hereafter EdS Universe) and at second order (tree-level) in 
Eulerian perturbation theory, the bispectrum is given by [28], 

B123 = 2^^2'(ki, k2)Pf'P2'^ + eye. perm., (2.4) 

^For equilateral configuration and up to tree level, Q does not depend on cosmology or scale. 

*Wick theorem states that the n-point correlation function of a Gaussian field is always zero when n is an odd 
number. 



- 2 - 



where B123 — i3(ki, k2, ka), P/" = P^{ki) is the hnear power spectrum, and the symmetrized two- 
point kernel is given by 

i^|(k„k,) =^- + \ cos(%) + y + ^ cos2(e,,), (2.5) 

where Oij is the angle between the vectors kj and k^. This formula is the second order perturbation 
theory contribution to the bispectrum which is the leading order contribution. On quasi-linear scales, 
this expression is a very good prediction but fails in the moderate non-linear regime. The dependence 
on cosmology of the two-point kernel F| is very weak and hence the cosmology dependence of the 
bispectrum is almost completely contained in P/". Because of this, in this work, we use the kernel of 
Eq. 2.5 even though we are dealing with the LCDM model. 

One can improve the tree-level PT prediction by going one step further and including one-loop 
corrections. However, at this point the computation of the bispectrum becomes cumbersome. For an 
initially Gaussian 5-field this yields four additional terms to the tree-level contribution (see Appendix 
C for details). 

An alternative way of reaching these non-linear scales, without using the one-loop correction, 
and to even push beyond the one-loop regime of validity, is with phenomenologically motivated mod- 
els. Phenomenological formulae can give simpler expressions in the non-linear regime and accurate 
predictions for the bispectrum. However, their physical motivation is limited and they usually have 
free parameters that need to be calibrated using N-body simulations. 

SC proposed a fitting formula based on the structure of the formula of Eq. 2.4. It consists 
in replacing the linear power spectrum by the non-linear one in Eq. 2.4 and the EdS two-point 
symmetrized kernel by 



(k„k,) = ^a(n„fc,)a(n,,fc,) (2.6) 

1 / k k \ 2 

-I- - cos{9,j) + -j^j b{n,, k,)b{nj,kj) + - cos^{9ij)c{ni,ki)c{n.j,kj), 

where the functions a(n, k), b{n, k) and c(n, k) are chosen to interpolate between the tree-level results 
and the hyper-extended perturbation theory regime (HEFT) [29] , 

( M_ l + arW[0-7Q3(n)]^/'(gai)"+'^^ 

Cll 7?/, K ) — 



1 + (gai)"+'^2 

n + 3)5"+'"^ 
^n+3.5 ' 

1 + 4.5a4/[1.5 + [n + 3)4](ga5)"+^ 



1 I A 

c{n, k) 



1 + (ga5)"+3-5 
Here n is the slope of the linear power spectrum at fc, 

_ d\ogP^{k) 



(2.8) 



dlog k 

and q = k/k^i, where k^i is the scale where non-linearities start to be important and is defined as, 

^y^''^ ^ 1; (2.9) 

tti are free parameters that must be fitted using data from simulations. In particular, SC propose the 
values, 

oi = 0.25, 02 = 3.5, 03 = 2, a4 = 1, = 2, ag = —0.2 . 
The function Qs^n) is given by 

4 — 2" 

Q^i^) = YT2^i- (2.10) 
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With all these changes, the SC approach reads, 

Si23 - 2F|ff(ki, k2)FiP2 + eye. perm., (2.11) 

where Pi is the non- linear power spectrum at fc^. On large scales, where the functions a, b and c 
— > 1 we recover the tree-level PT formula for the bispectrum. On the other hand, on small scales 
— 7> (7/10)(53 and b and c — > and we obtain Q123 — > Qsin), which is the prediction of HEPT. 

Another approach based on phenomenological formulae, is the one presented by [26]. The main 
idea is to rescale the linear formula of the bispectrum, by using some scale transformation in k. 
This way, the tree-level formula can easily be extended up to non-linear scales using the ansatz 
= [1 + A%^{h)]~'^^h; where A%^{k) = Pik)ky{2Tr^). 

This approach has by definition the drawback that it does not preserve the BAO-features of the 
bispectrum. In particular, the rescaling of k produces a spurious rescaling of the peaks and troughs of 
the BAO wiggles that do not match with the data, producing higher deviations than the SC approach. 
Because of that, we do not consider this approach in this paper. 

2.3 Our analytic formula 

Our approach in this paper is inspired by the SC approach. It consists of not only refitting the 
parameters from Eq. 2.7 but of also modifying their expression to make it more suitable for current 
precision N-body data and consider the redshift range of < z < 1.5. In order to do that, we use 
simulations with more particles, larger box sizes, and more realizations (and thus higher precision, 
better statistics and better error-control) with respect to previous works; we also consider snapshots 
at different redshifts. In order to improve the fitting precision, we also add 3 more parameters to the 
original model. The modified functions a{n, k), b{n, fc), c(n, k) then read, 

l + <^(z)[0.7g3(n)]i/^(gai)"+°^ 
1 -t- (gai)"+"2 



d{n, k) 



bin k) l + 0-2«3(n + 3)(ga7)"+^^+°- 



c(n, k) 



1 + 4.5a4/[1.5 + (n + 3)*]{qa5)''+^+''<^ 



1 + (9a5)"+3-5+"9 

Note that one recovers the original SC formulae in the limit of 07 — > 1 and ag, ag — > 0. 

The original SC formula was not designed to reproduce the BAO features. Applying this formula 
to a power spectrum with BAOs produces unphysical oscillations. These oscillations are much larger 
than those observed in simulations (see black dashed line in the right panel of Fig. 1). These 
oscillations are caused by the oscillatory behavior of the slope parameter n. One solution to this 
problem is to "dewiggle" the linear power spectrum [30] . However here we want to preserve the BAO 
oscillations. We propose to smooth the oscillatory behavior of the parameter n by means of splines, 
as is shown in the blue solid line of the left panel of Fig. 1. This provides an improved fit to the 
BAO-features, as it is shown by the blue solid line in the right panel of Fig. 1. In order to smooth out 
n we calculate its spline by taking a number of points n{k), where the points are chosen to be in the 
middle of the amplitude of each wiggle, such that when the points are connected a smooth line would 
pass through them. These points are used in the spline routine, and their second order derivatives are 
calculated for each point k. This output is then fed into the spline routine, which returns a smoothed 
value of n for each value of k. 

Our method consists in using this smoothed n and refit all the free parameters from Eq. 
2.12 using the reduced bispectrum data from N-body simulations. In particular, we use the following 
triangle configurations at different redshifts: Oi2/tt = 0.1, 0.2, . . . , 0.9, fc2/fci — 1.0,1.5,2.0,2.5 and 
z = 0,0.5,1,1.5. 

3 Simulations 

The simulations in this paper consist of two different sets, namely A and B. Each simulation is 
characterized by the box size, Lf,, the number of particles, Np, and the number of independent 
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Figure 1. Left panel: The slope n{k) (Eq. 2.8) from the linear power spectrum without smoothing (black 
dashed hne) and with a spline smoothing (blue solid line). Right panel: Q{ki) for k2/k\ = 2 and 9\2 = 0.6n. 
Red circles are data from simulations A and red squares from simulations B (see Table 1 for details on the 
simulations). Black dashed line is SC prediction without any spline in n{k) and blue solid line with the spline 
in n{k). 





A 


B 


Lb [Mpc/h] 


2400 


1875 




7683 


10243 




40 


3 


kN/4: [h/Mpc] 


0.25 


0.43 


softening e [kpc//i] 


90 


40 


PM grid 


20483 


20483 


ErrTolForceAcc a 


0.005 


0.005 


initial scale factor 


0.05 


0.02 


maximum A log a 


0.025 


0.025 


ErrTolIntAccuracy rj 


0.025 


0.025 


# time steps 


- 1300 


- 2500 



Table 1. Simulations details for simulations A and B. Lb is the box size, Np is the number of particles, Nr is 
the number of independent realizations. A quarter of the Nyquist frequency, fcjv/4, is the upper threshold to 
which we trust the simulation results at the percent level. The force resolution is specified by the softening 
parameter e and the Particle Mesh (PM) grid size. The short-range force accuracy is determined by a through 
the cell-opening criterion Ml^ > Q|aoid|''*, where M is the mass inside the cell, / its side length, aoid the total 
acceleration of the particle in the previous time step, and r the distance between the particle and the cell. 
The remaining parameters set the time stepping in Gadget-2: the maximum global time step in the logarithm 
of the scale factor, max(Aloga), and the parameter rj in the individual time step criterion Aa — y''2rje/\a\, 
where a is the acceleration of the individual particle. 



runs. Nr. Details about the two simulations are given in Table 1. As a rule of thumb, a maximum 
threshold in k for trusting the simulation data is set by a quarter of the Nyquist frequency, defined as 

1 /3 

fcjv/4 = nNp' /{4:Lb). At this scale it has been observed that the power spectrum starts to deviate at 
the 1%-level with respect to higher resolution simulations [31]. We confirmed this result using our two 
sets of simulations. For all the plots and results shown in this paper this limit in k is never exceeded. 

Both A and B simulations consist in a flat LCDM cosmology with cosmological parameters 
consistent with observational data. The cosmology used is J^a = 0.73, flm — 0.27 h — 0.7, ilbh^ — 
0.023, Us — 0.95 and 178(2: = 0) = 0.7913. The initial conditions were generated at z = 19 and 
z = 49 for simulations A and B respectively, by displacing the particles according to the second-order 
Lagrangian PT from their initial grid points. The initial power spectrum of the density fluctuations 
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ai = 0.484 aa = 3.740 03 = -0.849 
04 = 0.392 05 = 1.013 ae = -0.575 
ar = 0.128 ag = -0.722 ag = -0.926 



Table 2. Best-fit parameters (according to Eq. 2.12) derived by combining data from simulations A and B, 
using different triangle configurations 612/71 = 0.1, 0.2, . . . , 0.9 and ^2/^1 — 1.0,1.5,2.0,2.5 and at different 
redshifts z = 0,0.5, 1.0, 1.5. 

was computed by CAMB [32]. Taking only the gravitational interaction into account, the simulation 
was performed with GADGET-2 code [33]. 

As we estimate the error of the bispectrum from its dispersion among different realizations (see 
Eq. A. 5 in Appendix A for details), and given that we only have 3 simulations of type B, we divide 
each of these 3 boxes into 8 sub-boxes. Each of these 24 sub-boxes is then treated as if it were 
an independent realization with smaller box size, LJ, = 937.5 /i/Mpc, where each of these sub-boxes 
contains about 512"^ particles. The measurements of the bispectrum from sub-boxes suffer from two 
issues: a) the measurements are not completely independent and more importantly b) the sub-boxes 
are affected by modes larger than sub-box size. As a consequence of this, a new source of non-Gaussian 
errors arises for the power spectrum and bispectrum estimation, called beat-coupling effect [34-36]. 
However, by using the mean density measured in each sub-box instead of the global mean density 
for the normahzation of the density contrast, S = p/p~ 1, this effect gets strongly suppressed [37]. 
Hence, we expect that on overlapping scales the bispectrum errors estimated from simulation B to be 
slightly larger than those from A. This is shown in Appendix A. 

In order to obtain the dark matter field from particles we discretize each box of simulation A 
and each sub-box of simulation B using 512'^ grid cells. Thus the size of the grid cells is 4.68 Mpc//i 
in A and 1.83 Mpc//i in B. We assign the particles to the cells using the count-in-cells prescription. 

More details about the estimation of the bispectrum from simulations and the error bars com- 
putation are given respectively in Eq. A. 4 and A. 5 in Appendix A. 

4 Results 

In order to find the best-fit parameters from Eq. 2.12, namely a^, we minimize 

i 

using a set of triangle configurations: fca/fci = 1.0, 1.5, 2.0, 2.5 and Qx^ji^ — 0.1, 0.2, . . . , 0.9, at different 
redshifts: z = 0, 0.5, 1.0, 1.5^. The algorithm used for the minimization is amoeba [38]. In our analysis 
we neglect that the errors of the data points are correlated. However, since our errors are small 
(typically less than 5%) we expect that error correlations do not play an important role for this 
method: the dominating source of the error of our fitting formula given in Eq. 2.12 comes from 
the imperfection of the functional form of the fitting formula and not from the uncertainties in the 
simulation data. We have checked that the result converges from different starting points. The 
resulting best-fit values are shown in Table 2. 

We have also checked that this is a very good fit not only for the reduced bispectrum Q but also 
for the bispectrum B. 

In Fig. 2 and 3 we show the results of our fit and also the predictions of two other models for 
different triangles configurations for 2 = (Fig. 2) and for z = 1 (Fig. 3). These two models are 
1-loop Eulerian PT (see Appendix C) and the SC method (Eq. 2.11) -I- smoothed-n. In each plot we 
show the reduced bispectrum Q vs. k\ for: N-body data (black circles for simulations A and black 
squares for simulations B), 1-loop correction (red solid line corresponding to data of simulations A 
and red dashed line to data of simulations B), SC formula (green solid line for A and green dashed 

^ For simulation A the z used are 2 = 0, 0.5, 1.0, 1.5 whereas for simulation B 2 = 0, 0.42, 1.0, 1.5 
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line for B) and our model (blue solid and dashed line for A and B respectively). In the bottom part 
of each panel we show the deviation of these models with respect to the N-body data: red symbols 
depict the deviation of the 1-loop prediction with respect to the data, green symbols are the deviation 
of the SC formula and blue symbols are the deviation of our model. Circles are the deviation with 
respect to simulation A and squares with respect to simulation B. The error bars show the error 
of the bispectrum measured from the simulations (see Appendix A for details). Each panel shows 
a different triangle configuration: from left to right fc2/fci = 1.0, 1.5, 2.0 and from top to bottom 
012/ = 0.2, 0.4, 0.6, 0.8. In order to avoid sample variance effects, we only use data points with 
ki > 0.03 h/Mpc for simulation A and ki > 0.09 ft./Mpc for simulation B. 

For some triangle configurations, we observe a mismatch between the data points in the over- 
lapping region of simulations A and B . This is due to the fact that the data points do not exactly 
correspond to the same triangle configurations. The central value for ki is the same for both simula- 
tions in each panel. However, since A and B have different fundamental frequencies kf = 27r/Lf, and 
we take the bin width to be Afc — 3kf, the fc-space over which we average the bispectrum for each 
bin is different (see Appendix A for further explanation). Hence, when one computes the effective ki 
using Eq. A. 8 one obtains that, especially for elongated triangle configurations, the simulations do 
not represent the same triangle configuration. Since the theoretical predictions (both 1-loop, SC and 
our model) are computed from ki there is also a mismatching between solid and dashed lines for the 
same reason. This effect is also noted in [30]. 

We have checked that this mismatch is not due to the fact that the two simulations have different 
resolution but to the fact that different triangle configurations are sampled. When the triangle 
configurations and scales coincide, we do not observe any mismatch. Note also that in the cases 
where a mismatch appears -due to different configurations being sampled-, it is also present (and 
quantitatively similar) in the the 1-loop theory prediction (see for example top left panel of Fig. 3). 

At z = 0, the deviation of our model from the data is typically less than 5% and always less 
than 10% for k < 0.4 h/Mpc, whereas for SC the deviation reaches values of up to 20% and 1-loop 
clearly breaks down for fc > 0.1 h/Mpc. We also observe that for triangles close to equilateral, both 
the SC approach and our work present maximum differences to the N-body data. This might have 
to do with the fact that for equilateral (or close to equilateral) triangles the 3 sides enter the non- 
linear regime at the same time, and thus, non-linearities play a stronger role than for other triangle 
configurations, where each side enters the non-linear regime at different redshifts. Because of this, for 
other configurations the differences are smaller and remain within 5% deviation. 

For z = 1 all models work better because non-linearities do not play such an important role. 
However, even in this regime, our model works better than the other two. Again, for triangle config- 
urations close to equilateral, our model reaches its maximum deviation of about 10%. At z = 1, all 
other triangle configurations typically have errors within 5%. 

As a cross-check, in Appendix B we compare our model with another set of simulations of non- 
standard LCDM model. In particular for equilateral configurations, our formula reaches deviations 
up to 10%. However for the scalene configurations k2 — 2fci the deviations are only of order 3%. In 
the cases studied here our model improves significantly the SC fitting formula. 

5 Conclusions 

In this paper we propose a new simple formula to compute the dark matter bispectrum in the mod- 
erate non-linear regime (fc < 0.4/i/Mpc) and for redshifts z < 1.5. Our method is inspired by the 
approach presented in [1], but includes a modification of the original formulae, namely Eq. 2.12, and 
a prescription to better describe the BAO oscillations. The cosmology dependence of the reduced 
bispectrum is known to be very weak, and that of the bispectrum is almost completely contained in 
the power spectrum. Given that the cosmological model today is well constrained by observations we 
have considered a single cosmology here. 

Using LCDM simulations we fit the free parameters of our model. We end up with a simple 
analytic formula that is able to predict accurately the bispectrum for a LCDM Universe including 
the effects of BAO. Our main results are summarized by Eq. 2.11 where the kernel is given by Eq. 
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Figure 2. In the main panels: Q vs. fci for N-body simulation (black circles for simulation A and black 
squares for simulation B), for 1-loop correction (red solid lines to fit simulation A data and dashed line to fit 
simulation B), for SC formula (green solid line to fit A and green dashed line to fit B) and for our model (blue 
solid line to fit A and blue dashed line to fit B) at z = 0. In the sub-panels, the ratio between simulations and 
different theoretical models is shown: f-loop (red points), SC (green points) and our model (blue points) for 
different triangles configurations. Circle symbols comes from simulation A and squares symbol from simulation 
B. The error bars show the measured errors from the simulations. Dashed lines mark 5% deviation from data. 
From left to right panels: k2/ki — 1.0, 1.5,2.0. From top to bottom panels: 612/11 = 0.2,0.4,0.6,0.8. 
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2.6, the functions a,b,c are now given by a,b,c of Eq. 2.12, the fitting coefScients take the values 
reported in Table 2 and the function Q3{n) is still given by Eq. 2.10. The local slope of the linear 
power spectrum n{k) is not any more given directly by Eq. 2.8 but is a smoothed (BAO-free, but 
with the same broadband behavior) function of k. 
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Figure 3. Q{k\) for different triangles configurations at z = 1. Same notation that in Fig. 2. 
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The main conclusions of our work are listed below. 

1. Our method is able to predict the dark matter bispectrum for a wide range of triangle con- 
figurations up to fc OA h/Mpc and for a redshift range < z < 1.5. In particular, for the 
reduced bispectrum, our fitting formula agrees within 5% with N-body data for most of the 
triangle configurations and always within 10% for the worst cases. This presents a considerable 
improvement over previous phenomenological approaches and over the prediction of Eulerian 
perturbation theory. 

2. The equilateral and quasi-equilateral configurations are the ones for which our model deviates 
most strongly from N-body data. We interpret this as being due to the fact that when the 3 
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sides of the triangle are similar, non-linearities start to play a role at the same time, and thus, 
the effect on the bispectrum is stronger than when the non-linearities enter at different times, i.e. 
for elongated triangles. Other methods, like the one described by SC show the same behavior. 

3. We have checked that our model also works well for non-standard LCDM cosmologies (see 
Appendix B). In particular, we have checked that for k2 = 2ki the deviation between N-body 
data and our model is never higher than 3% and for equilateral triangles reaches 10%. Also 
in these non-standard LCDM cases studied here, our model works better than the SC fitting 
formula. 

We envision that this new analytic fitting formula will be very useful in providing a reliable 
prediction for the non-linear dark matter bispectrum for LCDM models. In particular, simple analytic 
predictions with high accuracy will be needed for the data analysis in the forthcoming era of precision 
data. 
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A Appendix: bispectrum estimator & error bars 

Here we present details on the computation of the bispectrum and its error bars from N-body simula- 
tions. Moreover, we compare our error estimates with the Gaussian analytic predictions and discuss 
the differences. 

We start by defining the estimator for the bispectrum as, 

S(ki,k2,k3) = ^ / [ Sq2 I d^(5D(qi +q2 + q3)'59i<55,(5,3 (A.l) 

Vb Jki Jk2 Jka 

where Vf = {2tt)^/LI = kj is the volume of the fundamental cell, kf. The integration is defined 
over the bin ki — Aki/2 < qi < ki + Afci/2. In this paper we always take Afc = 3kf. Vb is the 
six-dimensional volume of triangles defined by the triangle sizes fci, ^2 and fca with uncertainty AA:. 
Its value can be approximated by 

VB{ki,k2,k3)^ 1 d\i I d\2 rf\3<5D(qi +q2 + q3) - STT^fci/ca/csAA:^ (A.2) 

J ki J k^ J k^ 

which is good enough for not too small values of ki. The variance associated to this estimator depends 
on higher-order correlation functions: up to the 6-point connected correlation function. However, the 
main contribution to the variance is given by the power spectrum. Assuming that the fields are 
Gaussian, the variance associated to estimator presented above is [39], 

AB2(ki,k2,k3) = SB^{27rfP{ki)P{k2)P{k3) (A.3) 
Vb 

where the symmetry factor is ss = 6, 2, 1 for equilateral, isosceles or scalene configurations. The 
factor (27r)'^ comes from our definition of the power spectrum and bispectrum in Eq. 2.1 and 2.2. 
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On the other hand, the discretized version of this estimator used in this paper is, B 

i3(ki,k2,k3) ^^Y.^^ [S^ih)Sf{k2)Sf{k,)] (A.4) 

J 

where Ntri is the number of random triangle configurations used to compute the bispectrum; and 
j runs over these triangle configurations. For this work we use a number of random triangles that 
increases with k in the same way as the number of fundamental triangles: ^ Vb/V^. It reaches up 
to Ntri ~ 10^ for scales k ~ 0.4ft./Mpc. We have checked that increasing the number of random 
triangles beyond this value has no effect neither in the value of the bispectrum nor in its error. 
The index d in the 6 field stands for a discrete and dimensionless quantity. Therefore the quantity 
Re [Sj (ki)S^ {k2)Sj (k^)^ needs to be rescaled with the factor to make B matching with the definition 

of the bispectrum in Eq. 2.2. We compute the variance of this estimator B by the sample variance 
derived from the Nr realizations, 

AB'(ki, k2, ka) = ^ J2 k2, kg) - (B(ki, ks, kg))) (A.5) 

i 

where Bi is the bispectrum derived from the realization i and (B) is the mean over all realizations, 

1 ^ 

(i?(ki,k2,k3)) = — ^i3,(ki,k2,k3). (A.6) 

i 

The error on the mean (B) is then simply given by 

a^g^=AB/^r- (A.7) 

When comparing the measured N-body bispectrum with theoretical models and also when com- 
paring Eq. A. 3 with Eq. A.5, it is important to take into account the effect on the finite size of the 
triangle bins: each configuration is defined in terms of the sides of the triangle ki ± A/e/2. In this 
case, we are assuming Afc — 3kf and a large number of fundamental triangles fit into this bin. For 
certain configurations, it turns out that we have more triangles with k larger than the central value, 
ki. Because of that, one must correct the sides of the triangles by 

^ = ^E^^ (A.8) 

j 

where i = 1,2,3 for each dimension and the sum is taken over all random triangle generated in the 
bin. This correction is extremely important at large scales and for very squeezed triangles, and less 
important for equilateral configuration. 

In Fig 4 we present a comparison of the error estimation from theoretical models (Eq. A. 3) and 
simulations (Eq. A.5). In the left panel we show the error of the bispectrum associated to a volume of 
1 single box for Afc = 3fc/: using the theoretical model and the simulations for the case of fc2/^i = 1 
and 9i2 — O.Gtt triangle configurations. The blue line shows the theoretical model prediction for the 
error of the bispectrum of one single realization using the non-linear power spectra from simulations 
A and B (Eq. A. 3); whereas the red squares and green circles show the dispersion among the runs of 
simulations A and B respectively (Eq. A.5). In the right panel the ratio between the errors according 
to the simulations and the Gaussian prediction is plotted for simulations A (red squares) and B (green 
circles). The error estimates of simulations A agrees well with the theoretical model. On the other 
hand, on small scales the error estimates of simulations B is larger than the theoretical model and 
further increase with decreasing scale. Similar results were found by [40]. These differences are due 
to the fact that Eq. A. 3 neglects any higher-order contributions (because it assumes Gaussianity). 
However, at small scales this is no longer a good approximation as it is shown in [36]. Furthermore, 
the errors of simulations B have been estimated by dividing each of the 3 simulation boxes into 8 
sub-boxes. This introduces extra non-Gaussian terms [36] that are not taken into account in Eq. A. 3. 
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k1 [h/Mpc] k1 [h/Mpc] 



Figure 4. Left panel: AB for ki/ki — 1 and 612 = 0.6n triangles as a function of ki derived from simulations 
A (red squares) and from simulations B (green circles). Blue line is the theoretical prediction according to 
Eq. A. 3. Right panel: AB/ AB for simulations A (red squares) and B (green circles) 



B Appendix: our fitting formula for non-standard LCDM models 

Here we test how our model works with different LCDM simulations to those we have used to fit 
the ai parameters. In particular, we test our model with LCDM simulations with a /(i?)-like power 
spectrum. For a full description of the simulations and the f{R) gravity we refer the reader to [19]. 
These simulations were run with ENZO code and have slightly different cosmology than the ones 
used in the rest of this paper: 51a — 0.76, 51,„ — 0.24, il^ = 0.04181, h — 0.73. They consist of 6 
realizations that contain 256'^ particles in a box of 400 Mpc/h per side. The one-quarter Nyquist 
frequency is fcAr/4 = 0.5 ft,/Mpc. 

In Fig. 5 the reduced bispectrum Q is shown: in the right panel as a function of the angle 
between ki and k2, namely ^12, for k2 = 2fci; in the left panel as a function of ki for equilateral 
configuration, both for 2 = 0. All panels correspond to a LCDM model with no BAOs whose initial 
conditions make their power spectrum look like a /(i?)-like one. In order to do that a running index 
has been adopted in the initial conditions (see Table 1 in [19] for details). Panels correspond to LCDM 
simulations that match with f{R) models whose |/flo| parameter^ is: 10^^ (top panels), 10~^ (middle 
panels) and 10^^ (bottom panels ). 

Black points are data from simulations, green line is the SC prediction and blue line the prediction 
of our model. In the right panel we only compare data points for 0.4 < 6i2/tt < 0.9 and in the left 
panel 0.1 h/Mpc < k < 0.5 h/Mpc in order to ensure that all the ki are smaller than a quarter of the 
Nyquist frequency. 

Considering the right panels of Fig. 5 (fc2 — 2fci = 0.4/i/Mpc), we see that our model describes 
the data within about 3%. In particular for small scales (6*12 < 0.77r), both SC and our model agree 
with the simulations data well; however at large scales {612 > 0.77r), our model fits the data points 
better. In the left panel of Fig. 5 (equilateral configuration), we see that our model shows deviations 
up to 10%. As for the standard LCDM model, the equilateral configuration is the one with the largest 
deviations. However even in this case, our formula behaves better than the SC model, especially at 
small scales. 

Therefore we conclude that our formula is general enough to be applied also to some non-standard 
LCDM models and in particular works better than SC at small scales. 

C Appendix: one-loop correction terms for the power spectrum and bis- 
pectrum 

Here we present a short description of the equations used to compute the one-loop correction in 
Eulerian perturbation theory for the bispectrum shown in Fig. 2 and 3. For a detailed description of 
PT see [27, 39]. 

®see Eq. 2.3 in [19] for a definition of fnQ 
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Figure 5. In the left panels: Q vs. k\ for equilateral configuration. In the right panels Q vs Q\iI'k for 
^2 = 2fci = 0.4/i/Mpc. From top to bottom we show different non-standard LCDM models. All the panels 
correspond to LCDM models with /(i?)-like power spectrum from [19]. Top panels correspond to matching 
to |/flo| = 10~*; middle panels to |/ho| ~ 10~^; bottom panels to l/nol = 10~®. The sub-panel shows the 
corresponding ratio between simulations and different theoretical models: SC (green points) and our work 
(blue points). In the right sub-panels, dashed lines mark 2.5% deviation, whereas in the left panels mark the 
5% deviation. The bispectrum and its error are estimated by Eq. A. 6 and A. 7 (see Appendix A). 
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Up to one-loop, the power spectrum can be expressed as, 

P(A:) = p(")(fc) + F(i)(fc) + ... (C.l) 
where p(°^(fc) = Pl(A:) is the linear term and p(^^(/j) ~ Pisik) + P22(fc) is the one loop correction. 



For Gaussian initial conditions, the one-loop term consist of two terms^, 

P22 = j d^q^^f (q,k-q)Pi(q)PL(|k-q|) (C.2) 

A3 = J^PL{k) j d3qF3^(k,q,-q)Pi(<7) (C.3) 

P22 accounts for the mode coupling between waves with wave- vectors k — q and q, whereas P13 can 
be interpreted as the one-loop correction to the linear propagator. 

Similarly to the power spectrum, the bispectrum up to one loop consists of two terms, 

i?(ki,k2,k3) = i?("Hkl,k2,k3) + P(l)(ki,k2,k3) + ... (C.4) 

For Gaussian initial conditions, the first non-zero term is the second-order contribution B^'^^ which is 
tree level correction, whereas P^^^ is the one-loop correction. The tree level term can be expressed as, 

P(0)(ki,k2,k3) = 2P|(ki,k2)PL(fci)Pi(fc2) +2 eye. perm. (C.5) 

whereas the one loop consist of four terms (only for Gaussian initial conditions), 

B«(ki,k2,k3) = P2'22(kl,k2,k3)+Bi23(ki,k2,k3)+ (C.6) 
+ Bf2^3(ki,k2,k3) +B(i4(ki,k2,k3) 

Each of these terms read as, 

i?222 = j d^qP2'(-q,q + ki)P2^(-q-ki,q-k2)P2^(k2-q,q)PL(g) x (C.7) 

X Pi(|ki+q|)Pi(|k2-q|) 
5(23 = J^^P^ki) j d3qP3'(kl,k2 - q,q)^^|(k2 - q,q)PL(|k2 - q)PL(g) + 5 perm. (C.8) 
Biia = F^i^iM) [PL{ki)Pi3{k2) + -Pl (fc2)Pi3(fci)] + 2 perm. (C.9) 
Bi,^ = j^PL{ki)PL{k2) J d^'qFliq, -q, -ki, -k2)PL(g) + 2 perm. (C.IO) 

where P/ are the symmetrized kernels. For an EdS Universe, the non-symmetric kernels read as, 



P„(qi,...,q„) = E ^r^T^^i ' ' '^7^ [(^^ + ^)"('^' ki)P„_,„(q„.+i, . . . , q„)+ (C.ll) 

+ 2/3(k,ki,k2)G'„_.„(q„i+i, . . . ,q„)] 

G„(qi,...,q„) = E ^r^T^^i ' ' ki)P„_^(q„+i, . . . , q„)+ (C.12) 

+ 2n^(k,ki,k2)G„_„(q„j+i, . . . ,q„)] 

with Fi — Gi — 1. Also, ki = qi -I- • • • -I- q™, k2 = qm+i + ■ • • + q_n, k = ki -f k2 and the functions a 
and /3 are defined as. 



kki 

fc ^(ki-k 2) 

2 



a(k,ki) = ^^ (C.13) 



/?(k,ki,k2) ^ ^ l'^' (C.14) 



^The (2n)^ in the denominator comes from the definition of the power spectrum and bispectrum in Eq. 2.1 and 2.2 
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In order to obtain the symmetric kernels one has to symmetrize them with respect to their arguments, 

K(qi.--->qn) = ^^i^„(qx(l),---,q7r(n)) (C.15) 



where the sum is taken over all the permutations tt of the set {1, . . . ,n}. 
Finally, the reduced bispectrum up to one loop can be written as. 



where 



= PL{ki)PL{k2) + PL{kl)PL{ks) + PL{k2)PL{k3) (C.17) 

E« = PL{ki)P^'Hk2) + P^'Hki)PL{k2) + PLiki)P^'\k3) + (C.18) 

+ P^'Hkl)PL{k3) + PLik2)P^'Hk3) + P(l)(fc2)PL(fc3) 

Then it is easy to show that the linear and 1-loop correction terms for Q reads as, 

r(o) 
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